3 research outputs found

    Doctor of Philosophy

    Gene expression data repositories provide large and ever-increasing volumes of data for secondary use by translational informatics methods. For example, the Gene Expression Omnibus (GEO) houses over 37,000 experiments with the goal of supporting further research. To use these published results in a larger meta-analysis, the data must be consolidated; however, the data are largely unstructured, which hinders data integration efforts. Here, I propose the use of a novel pipeline, Ontology Based Data Integration (OBDI), which uses an ontological approach to combine the samples across multiple GEO experiments. The OBDI pipeline uses machine learning algorithms that permit researchers to consolidate and analyze data across GEO experiments. Here, I demonstrate how using an ontological approach to integrate samples across experiments can be used to explore the immune response at a molecular level. As part of this process, a Web Ontology Language (OWL) ontology was developed for each data platform used. OWL serves as a core component in successfully processing different sample types. Immunological experiments from GEO were consolidated to evaluate this methodology. The experiments included samples analyzed on expression arrays, BeadChips, and sequencing technologies. The integration of a complex biological system and the incorporation of different biological data types will validate the potential of OBDI. Biological data are inherently high-dimensional. OBDI incorporates tools and techniques that can handle the analysis of various biological data. The machine learning analysis performed within the OBDI pipeline successfully evaluated the newly annotated experiments and provided insights that can be further explored. The OBDI pipeline can help researchers annotate experiments using ontologies and analyze the annotated experiments.
To successfully build the pipeline, ontologies served as the backbone for integrating samples from GEO Series records into machine learning experiments using ML-Flex. By using the OBDI pipeline, researchers can access the uncurated experiments from GEO (GEO Data Series) and annotate the data using the terms in the ontologies. This mechanism allows data sets to be organized in relation to new experiments, independent of GEO's GDS curation process. The OBDI system allows ontologies to grow organically around a cluster of experiments. These experiments are then further analyzed in ML-Flex using machine learning algorithms. The curated experiments are analyzed in silico, and the computational analyses are supported by the OBDI ontological system.

    Universal Germline Testing of Unselected Cancer Patients Detects Pathogenic Variants Missed by Standard Guidelines without Increasing Healthcare Costs

    Purpose: To accurately ascertain the frequency of pathogenic germline variants (PGVs) in a pan-cancer patient population with universal genetic testing and to assess the economic impact of receiving genetic testing on healthcare costs. Methods: In this prospective study, germline genetic testing using a 105-gene panel was administered to an unselected pan-cancer patient population irrespective of eligibility by current guidelines. Financial records of subjects were analyzed to assess the effect of PGV detection on cost of care one year from the date of testing. Results: A total of 284 patients participated in this study, of whom 44 (15%) tested positive for a PGV across 14 different cancer types. Of the patients with PGVs, 23 (52%) were ineligible for testing by current guidelines. Identification of a PGV did not increase cost of care. Conclusion: Implementation of universal genetic testing for cancer patients in the clinic, beyond that specified by current guidelines, is necessary to accurately assess and treat hereditary cancer syndromes and does not increase healthcare costs.

    The Development of an Infrastructure to Facilitate the Use of Whole Genome Sequencing for Population Health

    The clinical use of genomic analysis has expanded rapidly, resulting in increased availability and utility of genomic information in clinical care. We have developed an infrastructure utilizing informatics tools and clinical processes to facilitate the use of whole genome sequencing data for population health management across the healthcare system. Our resulting framework scaled well to multiple clinical domains in both pediatric and adult care, although domain-specific challenges arose. Our infrastructure was complementary to existing clinical processes and well received by care providers and patients. Informatics solutions were critical to the successful deployment and scaling of this program. Implementation of genomics at the scale of population health relies on complicated technologies and processes that, for many health systems, are not supported by current information systems or existing clinical workflows. Scaling such a system requires a substantial clinical framework backed by informatics tools to facilitate the flow and management of data. Our work represents an early model that has been successful in scaling to 29 different genes with associated genetic conditions in four clinical domains. Work is ongoing to optimize informatics tools and to identify best practices for translation to smaller healthcare systems.